# High-precision quantization

**Gama 12b GGUF** · mradermacher · Downloads: 185 · Likes: 1
Tags: Large Language Model, Transformers, Multilingual
Gama-12B is a large language model supporting multiple languages, offered in a range of quantized versions to meet different performance and precision requirements.

**Acereason Nemotron 1.1 7B GGUF** · lmstudio-community · License: Other · Downloads: 278 · Likes: 1
Tags: Large Language Model, Multilingual
A high-performance 7B-parameter language model from NVIDIA, focused on mathematical and code reasoning tasks and supporting a 128k context length.

**Delta Vector Austral 24B Winton GGUF** · bartowski · License: Apache-2.0 · Downloads: 421 · Likes: 1
Tags: Large Language Model, English
A quantized version of Delta-Vector's Austral-24B-Winton model, quantized with llama.cpp and suitable for efficient use on a range of hardware configurations.

**Openbuddy OpenBuddy R1 0528 Distill Qwen3 32B Preview0 QAT GGUF** · bartowski · License: Apache-2.0 · Downloads: 720 · Likes: 1
Tags: Large Language Model, Multilingual
A quantized version of OpenBuddy-R1-0528-Distill-Qwen3-32B-Preview0-QAT; quantization lets the model run more efficiently under different hardware conditions.

**Infly Inf O1 Pi0 GGUF** · bartowski · Downloads: 301 · Likes: 1
Tags: Large Language Model, Multilingual
A quantized version of the infly/inf-o1-pi0 model, supporting multilingual text generation and optimized with llama.cpp's imatrix quantization.
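
Several entries on this page (including the one above) are produced with llama.cpp's imatrix quantization, which uses an importance matrix computed on calibration text to decide where low-bit formats can be applied most aggressively. The sketch below drives that pipeline from Python; it assumes llama.cpp is built and on PATH, that the binary names and flags (`llama-imatrix`, `llama-quantize`, `--imatrix`) match recent llama.cpp builds, and every file name is a placeholder rather than a file from this listing.

```python
# Hedged sketch of the llama.cpp imatrix quantization pipeline.
# Assumes llama.cpp has been built and its binaries are on PATH;
# all file names below are placeholders, not files from this page.
import subprocess

FP16_GGUF = "model-f16.gguf"    # full-precision GGUF export of the source model
CALIB_TXT = "calibration.txt"   # plain-text calibration corpus
IMATRIX   = "imatrix.dat"       # importance matrix produced in step 1
OUT_GGUF  = "model-Q4_K_M.gguf" # quantized output

# 1) Measure per-tensor activation importance on the calibration text.
subprocess.run(
    ["llama-imatrix", "-m", FP16_GGUF, "-f", CALIB_TXT, "-o", IMATRIX],
    check=True,
)

# 2) Quantize, letting the importance matrix guide the low-bit formats.
subprocess.run(
    ["llama-quantize", "--imatrix", IMATRIX, FP16_GGUF, OUT_GGUF, "Q4_K_M"],
    check=True,
)
```

The Q4_K_M file produced in step 2 is one example of the "multiple quantization types" that repositories like these typically publish side by side.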
**Pocketdoc Dans PersonalityEngine V1.3.0 24b GGUF** · bartowski · License: Apache-2.0 · Downloads: 2,543 · Likes: 4
Tags: Large Language Model
A multilingual, multi-purpose large language model covering a range of professional domains and general tasks, suitable for role-playing, story writing, programming, and other scenarios.

**Gryphe Pantheon Proto RP 1.8 30B A3B GGUF** · bartowski · License: Apache-2.0 · Downloads: 2,972 · Likes: 6
Tags: Large Language Model, English
A quantized version of the Gryphe/Pantheon-Proto-RP-1.8-30B-A3B model, quantized with llama.cpp and suited to role-playing and text generation tasks.

**Huihui Ai Qwen3 14B Abliterated GGUF** · bartowski · License: Apache-2.0 · Downloads: 6,097 · Likes: 5
Tags: Large Language Model
Qwen3-14B-abliterated is a quantized version of the Qwen3-14B model, optimized with llama.cpp and offering multiple quantization options to meet different performance requirements.

**Mlabonne Qwen3 14B Abliterated GGUF** · bartowski · Downloads: 18.67k · Likes: 16
Tags: Large Language Model
A quantized version of the Qwen3-14B-abliterated model, produced with llama.cpp's imatrix option and suitable for text generation tasks.

**Qwen Qwen3 32B GGUF** · bartowski · License: Apache-2.0 · Downloads: 49.13k · Likes: 35
Tags: Large Language Model
A quantized version of Qwen/Qwen3-32B, quantized with llama.cpp and offering multiple quantization types for different hardware requirements.
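
Whichever repository is chosen, the usage pattern is the same: download a single .gguf file at the quantization level that fits your hardware and load it with a llama.cpp front end. Below is a minimal sketch using the `huggingface_hub` and `llama-cpp-python` packages; the repo id and file name are illustrative assumptions, so copy the exact names from the model card you actually use.

```python
# Hedged sketch: fetch one quantization of a GGUF repo and chat with it locally.
# Requires the optional packages huggingface_hub and llama-cpp-python.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Illustrative names only; substitute the real repo id and .gguf file name
# shown on the model card you want.
model_path = hf_hub_download(
    repo_id="bartowski/Qwen_Qwen3-32B-GGUF",
    filename="Qwen_Qwen3-32B-Q4_K_M.gguf",
)

llm = Llama(
    model_path=model_path,
    n_ctx=4096,        # context window to allocate
    n_gpu_layers=-1,   # offload all layers to GPU if one is available
)

reply = llm.create_chat_completion(
    messages=[{"role": "user", "content": "In one sentence, what is GGUF?"}],
    max_tokens=64,
)
print(reply["choices"][0]["message"]["content"])
```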
**Nvidia OpenMath Nemotron 14B Kaggle GGUF** · bartowski · Downloads: 432 · Likes: 1
Tags: Large Language Model, English
A 14B-parameter mathematical large language model open-sourced by NVIDIA, quantized with llama.cpp so it can run efficiently under different hardware conditions.

**Mistral Small 24B Instruct 2501 GGUF** · bartowski · License: Apache-2.0 · Downloads: 48.61k · Likes: 111
Tags: Large Language Model, Multilingual
Mistral-Small-24B-Instruct-2501 is a 24B-parameter instruction-tuned large language model supporting multilingual text generation tasks.

**Pocketdoc Dans SakuraKaze V1.0.0 12b GGUF** · bartowski · License: Apache-2.0 · Downloads: 788 · Likes: 3
Tags: Large Language Model, English
A llama.cpp imatrix-quantized version of PocketDoc/Dans-SakuraKaze-V1.0.0-12b, supporting multiple quantization types and suitable for text generation tasks.

**Llama 3.3 70B Instruct Abliterated GGUF** · bartowski · Downloads: 7,786 · Likes: 8
Tags: Large Language Model, Multilingual
A 70B-parameter large language model based on the Llama 3.3 architecture, supporting multilingual text generation and quantized for use in a variety of hardware environments.

**Zero Mistral 24B GGUF** · ZeroAgency · License: MIT · Downloads: 613 · Likes: 3
Tags: Large Language Model, Multilingual
Zero-Mistral-24B is a large language model based on the Mistral architecture, supporting Russian and English and suited to dialogue and text generation tasks.

**Google Gemma 3 27b It Qat GGUF** · bartowski · Downloads: 14.97k · Likes: 31
Tags: Large Language Model
A quantized version of Google's 27B-parameter instruction-tuned Gemma 3 model, built from quantization-aware training (QAT) weights and offering multiple quantization levels to meet different hardware requirements.
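
The recurring claim that these repositories offer "multiple quantization levels to meet different hardware requirements" comes down to simple arithmetic: weight memory is roughly parameter count times bits per weight. The estimator below uses commonly quoted, approximate bits-per-weight figures for llama.cpp quant types; real files run a little larger because of embeddings, metadata, and the runtime KV cache.

```python
# Rough memory estimate for a quantized GGUF: params (billions) x bits/weight / 8.
# The bits-per-weight values are approximations, not exact llama.cpp figures.
APPROX_BPW = {"Q8_0": 8.5, "Q6_K": 6.6, "Q5_K_M": 5.7,
              "Q4_K_M": 4.8, "Q3_K_M": 3.9, "Q2_K": 2.6}

def approx_size_gb(params_billion: float, quant: str) -> float:
    """Approximate size of the weights alone, in GB."""
    return params_billion * APPROX_BPW[quant] / 8

for quant in APPROX_BPW:
    # Example: a 27B model such as the Gemma 3 QAT build listed above.
    print(f"27B @ {quant:>6}: ~{approx_size_gb(27, quant):.1f} GB")
```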
**Nvidia Llama 3 1 Nemotron Ultra 253B V1 GGUF** · bartowski · License: Other · Downloads: 1,607 · Likes: 3
Tags: Large Language Model, English
A quantized version of NVIDIA's Llama-3_1-Nemotron-Ultra-253B-v1 model, quantized with llama.cpp, supporting multiple quantization types and suitable for a range of hardware environments.

**Llama 4 Scout 17B 16E Instruct GGUF** · second-state · License: Other · Downloads: 2,959 · Likes: 0
Tags: Large Language Model, Transformers, Multilingual
Llama-4-Scout-17B-16E-Instruct is a multilingual instruction-tuned model that can be run through LlamaEdge.

**Qwen Qwen2.5 VL 32B Instruct GGUF** · bartowski · License: Apache-2.0 · Downloads: 2,782 · Likes: 1
Tags: Image-Text-to-Text, English
Qwen2.5-VL-32B-Instruct is a 32B-parameter multimodal vision-language model supporting image understanding and text generation tasks.

**Gemma 3 R1984 27B Q6 K GGUF** · GrimsenClory · Downloads: 28 · Likes: 1
Tags: Large Language Model, Multilingual
A GGUF-format model converted from VIDraft/Gemma-3-R1984-27B, supporting multilingual text generation.

**Mlabonne Gemma 3 12b It Abliterated GGUF** · bartowski · Downloads: 7,951 · Likes: 6
Tags: Large Language Model
A quantized version of the mlabonne/gemma-3-12b-it-abliterated model, imatrix-quantized with llama.cpp and suitable for text generation tasks.

**Gemma 3 12b It Q8 0 GGUF** · NikolayKozloff · Downloads: 89 · Likes: 1
Tags: Large Language Model
A model converted from google/gemma-3-12b-it to GGUF format, for use with the llama.cpp framework.

**Gemma 3 27b It GGUF** · second-state · Downloads: 2,024 · Likes: 0
Tags: Image-Text-to-Text, Transformers
Gemma-3-27b-it-GGUF is a quantized version of Google's Gemma-3-27b-it model, suitable for image-text-to-text tasks.

**Gemma 3 4b It GGUF** · second-state · Downloads: 2,120 · Likes: 0
Tags: Transformers
Gemma-3-4b-it-GGUF is a quantized version of Google's Gemma-3-4b-it model that runs on LlamaEdge and is suitable for image-text-to-text tasks.

**Rombo Org Rombo LLM V3.1 QWQ 32b GGUF** · bartowski · License: Apache-2.0 · Downloads: 2,132 · Likes: 5
Tags: Large Language Model
Rombo-LLM-V3.1-QWQ-32b is a 32B-parameter large language model, imatrix-quantized with llama.cpp and offered in multiple quantized versions to suit different hardware requirements.

**Thedrummer Skyfall 36B V2 GGUF** · bartowski · License: Other · Downloads: 40.42k · Likes: 11
Tags: Large Language Model
Skyfall-36B-v2 is a 36B-parameter large language model, imatrix-quantized with llama.cpp and offered in multiple quantized versions to suit different hardware requirements.

**L3.3 MS Nevoria 70b GGUF** · bartowski · Downloads: 5,252 · Likes: 12
Tags: Large Language Model
A quantized version of the Steelskull/L3.3-MS-Nevoria-70b model, imatrix-quantized with llama.cpp and supporting multiple quantization levels for different hardware environments.

**Sky T1 32B Preview GGUF** · bartowski · Downloads: 1,069 · Likes: 81
Tags: Large Language Model, English
Sky-T1-32B-Preview is a 32B-parameter large language model, quantized with llama.cpp's imatrix option and suitable for text generation tasks.

**EXAONE 3.5 32B Instruct GGUF** · bartowski · License: Other · Downloads: 616 · Likes: 9
Tags: Large Language Model, Multilingual
EXAONE-3.5-32B-Instruct is a 32B-parameter large language model supporting instruction following and dialogue tasks.

**Impish Mind 8B GGUF** · bartowski · License: Apache-2.0 · Downloads: 532 · Likes: 9
Tags: Large Language Model, English
A quantized version of the SicariusSicariiStuff/Impish_Mind_8B model, processed with llama.cpp into a range of quantization formats and suitable for text generation tasks.

**Mini Magnum 12b V1.1 GGUF** · Reiterate3680 · License: Other · Downloads: 252 · Likes: 2
Tags: Large Language Model, English
Mini-Magnum-12b-v1.1 is a text generation model built on the intervitens/mini-magnum-12b-v1.1 base model, supporting English and distributed in quantized form.

**Gemma 2 27b It Q8 0 GGUF** · KimChen · Downloads: 471 · Likes: 2
Tags: Large Language Model
A GGUF-format model converted from Google's Gemma-2-27b-it model, suitable for text generation tasks.

**Darksapling V2 Ultra Quality 7B GGUF** · DavidAU · License: Apache-2.0 · Downloads: 385 · Likes: 3
Tags: Large Language Model, English
A complete remerge and remaster of the Dark Sapling V2 7B model with a 32k context length, featuring ultra-high quality and 32-bit precision improvements.

**Llama 3 Cat 8b Instruct V1 GGUF** · bartowski · Downloads: 909 · Likes: 12
Tags: Large Language Model
An 8B-parameter instruction-tuned model based on Meta's Llama 3 architecture, quantized to GGUF and suitable for resource-constrained environments.

**Mixtral 8x22B V0.1 GGUF** · bartowski · License: Apache-2.0 · Downloads: 597 · Likes: 12
Tags: Large Language Model, Multilingual
A quantized version of Mixtral-8x22B-v0.1, quantized with llama.cpp, supporting multiple languages and quantization types.